Abstract

The phenomenon of human language is widely studied from various points
of view. It is of interest not only to social scientists, anthropologists
and philosophers, but also to those interested in network dynamics. In
several recent papers the word web, or language as a graph, has been
investigated [1, 2].

In this paper I review recent studies of the syntactical word web [1, 4].
I present a model of a growing network in which such processes as node
addition, edge rewiring and new link creation are taken into account.
I argue that this model is a satisfactory minimal model explaining the
measured data [1, 5].

PACS 87.10.+e, 89.20.-a, 89.75.Da, 89.75.Hc

keywords: scale free networks, word web




1 Introduction 



Networks are nowadays a very popular subject of investigation. They are
good models for various types of interactions, such as social interactions,
professional interactions [6], interactions in biology [7], interaction as communication
0[9] to which belong also interactions of people through language [U [21 IH IS] • 
Networks are also interesting objects to study theoretically, because their
properties are strongly influenced by the network history and dynamics.
A network can grow with time by node addition; nodes can also become extinct.
Several questions have been asked about the details of the network dynamics.
For example, how does the dynamics influence the overall network structure
|1C H lll j [T3]. or what is the dynamics governing real networks [12J? 

A network is a collection of nodes interacting through edges. Binary
undirected networks are the simplest ones; the edge between two nodes either
exists or not. Networks are usually characterized by several local and global 
measures jfij. The most important local measures are clustering coefficient 
$C$ and the average node degree $k$. Mathematically, the clustering $C_i$ of
node $i$ is the probability that two neighbours of node $i$ are neighbours of
each other as well. The network clustering coefficient $C$ is the average of
all the $C_i$. The clustering coefficient is in fact a measure of nontrivial
"structure" in the network. Nontrivial means here that the network is
neither a tree nor a simple regular lattice with nearest neighbour
connections only. As a global measure, the node separation $l$ (the average
shortest path between randomly chosen sites) is typically used. The
separation of nodes shows how "close" one node is to another or, in other
words, how well the nodes communicate through edges.
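The two measures defined above can be computed directly for a small graph.
The following is a minimal sketch, not the procedure used for the measured
data: the adjacency-set representation, the function names and the toy graph
are illustrative assumptions, and nodes with fewer than two neighbours are
given clustering 0 by convention.

```python
from collections import deque
from itertools import combinations

def local_clustering(adj, i):
    """C_i: fraction of pairs of neighbours of i that are themselves linked."""
    nbrs = adj[i]
    if len(nbrs) < 2:
        return 0.0  # assumed convention for nodes with fewer than two neighbours
    linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * linked / (len(nbrs) * (len(nbrs) - 1))

def average_clustering(adj):
    """Network clustering coefficient C: the mean of all C_i."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)

def average_separation(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# hypothetical toy graph: triangle a-b-c with a pendant node d attached to c
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
C = average_clustering(toy)
l = average_separation(toy)
```

For the toy graph the triangle gives high clustering to nodes a and b, while
the pendant node lowers both the average clustering and the compactness.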

A special type of network is the small world network [6]. Its structure is
a compromise between the preservation of local regularity, which tends to
increase the node separation $l$, and good global communication between nodes
through random shortcuts. In these networks a high clustering coefficient
$C$ is combined with a low node separation $l$.

As mentioned above, networks usually change their size with time [13].
Many real networks, such as the internet or the word web, grow by the
continual addition of new nodes. The node addition might be accompanied
by node deletion, but the fraction of deleted nodes is often negligible.
Therefore the dynamics of real networks is well captured by the models of
growing nets

ina damns]. 

Many recent studies have shown [7, 11, 12, 14] that real networks, which
are created by self organized processes, have common features. Their static
properties are similar to those of small world nets. On the other hand,
their degree distribution function, which is influenced by the dynamics,
has a power law character:

$$P(k) \propto k^{-\gamma}. \tag{1}$$

Such networks are called scale free [10]. The word web has the same
properties [1, 5].

In this paper a positional word web is studied [H 0]. Here the words 
are nodes and the word interaction is defined by the neighbourhood in a 
sentence. Language is a living phenomenon, developing all the time. Some 
words are created and some of them fall into disuse. Hence, to understand 
the word web dynamics, it is important to examine the dynamics of nets 
with changing number of sites. 

This paper is organized as follows: In Section 2 the question of scale free
network structure and dynamics is studied. Section 3 is devoted to the
mathematical models of the positional word web, and in Section 4 my word
web model is presented.



2 Scale free networks 

As mentioned in the previous section, scale free structure is a result of
self organized network development. Therefore this process should be
natural and simple. The nature of the processes leading to the scale free
structure was investigated by Barabási and Albert in their fundamental
paper [10]. In the Barabási-Albert model (BA model) the growth of the net
starts from a small bunch of interconnected nodes. Each time unit a new
node arrives and attaches itself to the old network by $m$ new links. The
probability of linking to a certain old node $i$ is proportional to its
degree $k_i$. This type of linking is called preferential.
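The growth rule just described can be sketched in a few lines. This is only
an illustration under stated assumptions, not the implementation of [10]:
the seed is taken to be a complete graph on m + 1 nodes, the m targets of a
new node are forced to be distinct, and the name `grow_ba` is hypothetical.

```python
import random

def grow_ba(steps, m=2, seed=0):
    """Grow a BA-type network and return the list of node degrees.

    Assumptions of this sketch (not taken from the text): the seed is a
    complete graph on m + 1 nodes and the m targets of a new node are distinct.
    """
    rng = random.Random(seed)
    n0 = m + 1
    degree = [m] * n0                 # complete graph: every node has degree m
    # one list entry per edge end, so a uniform draw from `ends`
    # selects a node with probability proportional to its degree
    ends = [i for i in range(n0) for _ in range(m)]
    for _ in range(steps):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        new = len(degree)
        degree.append(m)              # the newcomer arrives with m links
        for old in targets:
            degree[old] += 1
            ends.extend((old, new))
    return degree

degree = grow_ba(steps=2000, m=2)
```

Keeping one list entry per edge end is a common way to make a uniform draw
behave as a degree-proportional one; a histogram of `degree` on log-log axes
can then be compared with the power law (1).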

There are several possibilities how to describe these processes
mathematically. In many cases the most efficient seems to be the continuous
approach of Dorogovtsev and Mendes [13]. Newly arriving nodes are labelled
by their birth time $s$. At time $t$, node $s$ has, on average, $k(s,t)$
neighbours. The average degree $k(s,t)$ is given by the equation

$$\frac{\partial k(s,t)}{\partial t} = m\,\frac{k(s,t)}{\int_0^t k(s,t)\,ds}. \tag{2}$$

Here the right-hand side of the equation expresses how $k(s,t)$ changes by
preferential linking. To find the solution, the sum of all node degrees,
expressed as the normalization integral $\int_0^t k(s,t)\,ds$ in the
denominator, has to be estimated. This is easy: each time unit $m$ new edges
increase the sum by $2m$. Therefore

$$\int_0^t k(s,t)\,ds = 2mt. \tag{3}$$

Substituting (3) into (2), equation (2) is easily solved [13]:

$$k(s,t) \propto \left(\frac{s}{t}\right)^{-\beta} \propto s^{-\beta}. \tag{4}$$
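The step from (2) and (3) to (4) can be written out explicitly. The sketch
below assumes the boundary condition $k(s,s) = m$ (a node is born with $m$
links), which follows from the model description but is not written as a
formula in the text.

```latex
\begin{align*}
\frac{\partial k(s,t)}{\partial t}
  &= m\,\frac{k(s,t)}{2mt} = \frac{k(s,t)}{2t}
  && \text{(substituting (3) into (2))} \\
\int_{m}^{k(s,t)} \frac{dk'}{k'}
  &= \frac{1}{2}\int_{s}^{t} \frac{dt'}{t'}
  && \text{(separating variables, } k(s,s) = m\text{)} \\
\ln\frac{k(s,t)}{m} &= \frac{1}{2}\,\ln\frac{t}{s} \\
k(s,t) &= m\left(\frac{t}{s}\right)^{1/2}
        = m\left(\frac{s}{t}\right)^{-1/2}
\end{align*}
```

The exponent of the power law in (4) can be read off from the last line.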



In (4) the exponent is $\beta = 1/2$. Having $k(s,t)$, the power law degree
distribution $P(k)$ (1) is easily calculated analytically [13]. But to get
(1) together with the scaling exponent $\gamma$, such calculations are not
necessary. It has been proven [13] that the exponents $\beta$ and $\gamma$
are related by the scaling relation

$$\beta(\gamma - 1) = 1. \tag{5}$$

From (5) and (4) one gets $\gamma_{BA} = 3$, and thus the scale free degree
distribution is

$$P(k) \propto k^{-3} \tag{6}$$

in the BA model.

To summarize, preferential linking leads to the scale free network
structure (1). Several other types of node linking were investigated. It
has been shown that if the node linking is a variation of preferential
connection, the structure is scale free, but with
$\gamma \neq \gamma_{BA}$ [11].

3 Positional word web

The lexicon of a human language is composed of several tens of thousands of
words. In spite of the huge number of concepts, the human brain is capable
of managing them very quickly. Our speech is fluent; we are capable of quick
retrieval from a large word database. How is this possible? How is the
human lexicon implemented in the brain? Of course, there are several
theories about it. One of them says that the lexicon has the structure of a
small world graph [6, 2, 1, 5].

Let us imagine a graph with words as nodes. Each word is connected by some
edges (interactions) to other words. It seems reasonable to define an
interaction in two different manners, which lead to two different word
nets, namely the conceptual [2] and the positional [1] one. The first one is
related to semantics and the second one to syntax. Both of them have small
world properties, namely, large clustering combined with small node
separation.

The positional net is related to syntax and reflects the co-occurrence of
words in a sentence. The words (graph sites) are connected by an edge if
they are neighbours in a sentence. In the human lexicon, two subsets of
different size are recognized, namely the kernel lexicon and the rest. The
kernel lexicon includes about ten thousand of the most frequent words,
known to the majority of people speaking the language. The other part,
comprising a hundred thousand words, is used in various specialized
communities. Studies of the positional word web show that its distribution
function $P(k)$ scales as (1), but with two different exponents [4]. For
well connected kernel words ($k$ is large), the scaling exponent is close to
the theoretically predicted value $\gamma_{BA} = 3$. Less connected words
scale with $\gamma = 1.5$. These two scaling regimes were explained by the
model of Dorogovtsev and Mendes [4] (DM model).
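The construction rule for the positional net (words adjacent in a sentence
are linked) can be sketched as follows. The tokenisation is an assumption of
this example, not the procedure of [1] or [5]: sentences are split at '.',
'!' and '?', words are runs of letters and apostrophes, and case is ignored.
The sample text and the function name are purely illustrative.

```python
import re
from collections import defaultdict

def positional_word_web(text):
    """Link two words whenever they are adjacent in the same sentence.

    Assumptions of this sketch: sentences end at '.', '!' or '?', words are
    runs of letters and apostrophes, and case is ignored.
    """
    adj = defaultdict(set)
    for sentence in re.split(r"[.!?]+", text):
        words = re.findall(r"[a-z']+", sentence.lower())
        for left, right in zip(words, words[1:]):
            if left != right:          # skip self-loops from repeated words
                adj[left].add(right)
                adj[right].add(left)
    return adj

# illustrative sample only; not the corpus used for the measurements
sample = ("In the beginning God created the heaven and the earth. "
          "And the earth was without form.")
web = positional_word_web(sample)
degree = {word: len(nbrs) for word, nbrs in web.items()}
```

Even in this tiny sample the function word "the" collects far more
neighbours than the content words, which is the kind of heterogeneity the
degree distribution $P(k)$ summarises.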

The model is as follows: each time unit a new node arrives and links itself
preferentially by $m$ edges. Simultaneously $ct$ new edges (that means $2ct$
edge ends, $c \ll 1$) are created and connect the old nodes preferentially.
In this case $k(s,t)$ changes with time as

$$\frac{\partial k(s,t)}{\partial t} = (m + 2ct)\,\frac{k(s,t)}{\int_0^t k(s,t)\,ds}, \tag{7}$$

where the integral gives the sum of node degrees

$$\int_0^t k(s,t)\,ds = 2mt + ct^2. \tag{8}$$

With the help of (8) the solution of (7) is found [4]:

$$k(s,t) \propto \left(\frac{t}{s}\right)^{1/2}\left(\frac{2m + ct}{2m + cs}\right)^{3/2}, \tag{9}$$

and two scaling regimes are recognised. For $s \ll t$ (well connected words)
$\beta_{DM} = 1/2$ and $\gamma_{DM} = 3$, and for $s \sim t$ (less connected
words) $\beta_{DM} = 1/2 + 3/2$ and $\gamma_{DM} = 1.5$ (5).
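The two exponents can be traced back to a partial fraction decomposition.
The following is a sketch of that integration under the assumption that (8)
is inserted into (7) and the result is integrated from the birth time $s$ to
$t$; the initial value $k(s,s)$ is left unspecified.

```latex
\begin{align*}
\frac{\partial \ln k(s,t)}{\partial t}
  &= \frac{m + 2ct}{2mt + ct^{2}}
   = \frac{1}{2}\,\frac{1}{t} + \frac{3}{2}\,\frac{c}{2m + ct} \\
\ln\frac{k(s,t)}{k(s,s)}
  &= \frac{1}{2}\ln\frac{t}{s} + \frac{3}{2}\ln\frac{2m + ct}{2m + cs} \\
k(s,t) &= k(s,s)\left(\frac{t}{s}\right)^{1/2}
          \left(\frac{2m + ct}{2m + cs}\right)^{3/2}
\end{align*}
```

For $cs \ll 2m$ the second factor does not depend on $s$, so
$k \propto s^{-1/2}$; for $cs \gg 2m$ it contributes a further $s^{-3/2}$, so
$k \propto s^{-2}$. With the scaling relation (5) these values of $\beta$
give $\gamma = 3$ and $\gamma = 1.5$ respectively.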

Let us check how well the DM model describes the measured data. The
distribution $P(k)$ measured by Cancho and Sole [1], as well as our own
studies (Fig. 1) [5], show that there is a discrepancy between the critical
exponents predicted by the DM model and the measured exponents. In the less
steep part of the distribution $\gamma \approx 1.5$, which is the same as,
or very close to, $\gamma_{DM}$. But $\gamma$ of the steeper part of the
distribution shows a systematic deviation. In both cases it is lower
(2.7 [1] and 2.13 [5]) than $\gamma_{BA} = 3$. I suppose that this is due to
the fact that the DM model doesn't include all processes significant for
the word web. In the next section I propose a model which fits the data
more accurately.



[Figure 1, log-log plot "All versions (logarithmic binning distribution)":
N(k) for the data sets asv, bev, kjv, nrsv and drv, with fitted slopes
gamma = 1.49 and gamma = 2.13.]

Figure 1: Connectivity distribution for the positional word web constructed
from several English versions of the Bible (log-log plot). Some of them,
such as the King James Version (kjv) and the Douay-Rheims Version (drv), are
old (kjv was issued in 1611; drv is even older, 1582); the others (American
Standard Version, asv, 1901; Basic English Version, bev, 1941; New Revised
Standard Version, nrsv, 1989) are relatively modern. The bev is special,
because its text has been artificially simplified. This is reflected in a
slightly different distribution.



4 Word web model 

What are the other events that should be considered in the positional word
web? New words are created and added to the vocabulary all the time. They
are used in sentences. Simultaneously, old words might be used in new
phrases or contexts. In the word web this means the creation of new edges
among old nodes. Both events are included in the DM model (7). What are the
other possibilities? As time flows, the meaning of a word might change
slightly (or even significantly). In the word web this means that some of
the old connections are broken and rewired. Edge rewiring can be
preferential, random, or a combination of both.

In [13] another model with preferential attachment is solved analytically:
every time unit a node arrives and is linked preferentially to $m$ old
nodes. At the same time $m_r = m_{r,p} + m_{r,r}$ old nodes are randomly
selected. One edge end of each of the $m_r$ nodes is rewired: $m_{r,p}$ of
them are rewired preferentially, and $m_{r,r}$ ends randomly. The model is
solved and the scaling exponent $\gamma$ is found:

$$\gamma = 2 + \frac{m - m_{r,p}}{m + m_{r,p}}. \tag{10}$$

If $m_{r,p} = 0$, $\gamma = \gamma_{BA}$. Because $m_{r,p} > 0$, rewiring
lowers the $\gamma$ exponent and thus $\gamma < \gamma_{BA}$.

To fit the measured data [1, 5] (Fig. 1) I designed a model which includes
a minimal number of events. My model maintains two scaling regimes in the
distribution function $P(k)$ (1). It also explains why the $\gamma$ exponent
of the steeper part of $P(k)$ is below the BA value 3. In some sense, the
model is a combination of a model with edge rewiring [13] and the DM
model [4].

Again, each time unit a node is added and linked preferentially with $m$
edges to the older nodes. Simultaneously, other events occur:

1. $ct$ new edges are created and linked preferentially among old nodes;

2. $m_r$ old nodes are randomly selected and one edge end of each of them
is rewired preferentially.
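The three growth events listed above can be turned into a small simulation.
What follows is a sketch under several assumptions that the text does not
fix: the seed is a complete graph on m + 1 nodes, `m_r` is an integer, the
fractional part of $ct$ is rounded stochastically, multiple edges and
self-loops are tolerated, and within one time step the events are applied in
the order new edges, rewiring, node addition. The name `grow_word_web` is
hypothetical.

```python
import random

def grow_word_web(steps, m=2, c=0.001, m_r=1, seed=0):
    """Sketch of the growth rules; returns (degree list, edge list).

    Assumptions not fixed by the text: complete-graph seed on m + 1 nodes,
    integer m_r, stochastic rounding of c*t, multi-edges and self-loops allowed.
    """
    rng = random.Random(seed)
    n0 = m + 1
    edges = [[a, b] for a in range(n0) for b in range(a + 1, n0)]
    incident = [[] for _ in range(n0)]      # edge indices touching each node
    for idx, (a, b) in enumerate(edges):
        incident[a].append(idx)
        incident[b].append(idx)

    def preferential():
        # a random end of a random edge is a degree-proportional node choice
        return rng.choice(rng.choice(edges))

    def add_edge(a, b):
        edges.append([a, b])
        incident[a].append(len(edges) - 1)
        incident[b].append(len(edges) - 1)

    for t in range(1, steps + 1):
        n_old = len(incident)               # nodes present before this step
        # c*t new edges between preferentially chosen old nodes
        n_new = int(c * t) + (rng.random() < (c * t) % 1)
        for _ in range(n_new):
            add_edge(preferential(), preferential())
        # m_r random old nodes each hand one edge end to a preferential node
        for _ in range(m_r):
            loser = rng.randrange(n_old)
            if not incident[loser]:
                continue                    # nothing to rewire on an isolated node
            idx = incident[loser].pop(rng.randrange(len(incident[loser])))
            side = edges[idx].index(loser)
            winner = preferential()
            edges[idx][side] = winner
            incident[winner].append(idx)
        # node addition: the newcomer links preferentially to m old nodes
        targets = [preferential() for _ in range(m)]
        incident.append([])
        for target in targets:
            add_edge(n_old, target)

    degree = [len(inc) for inc in incident]
    return degree, edges

degree, edges = grow_word_web(steps=500, m=2, c=0.01, m_r=1, seed=1)
```

Rewiring moves an edge end from one node to another, so it changes
individual degrees but not their sum; only node addition and the $ct$ new
edges increase the total number of edge ends.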

In the continuous approach these processes are described by the equation

$$\frac{\partial k(s,t)}{\partial t} = (m + 2ct + m_r)\,\frac{k(s,t)}{\int_0^t k(s,t)\,ds} - \frac{m_r}{t}. \tag{11}$$

To solve equation (11), the integral $\int_0^t k(s,t)\,ds$ should be
specified. As in the previous model, it is the sum of all degrees in the
net. This sum is changed only by the creation of new links; rewiring leaves
it unaffected. The edge creation processes are:

a) Edge addition - $m$ new links come each time unit with a new node.

b) Appearance of new edges - $ct$ new links, or $2ct$ new link ends, appear
each time unit among old nodes.

The rewiring process is:

c) $m_r$ nodes are randomly selected. Each of them loses one edge end.
This is expressed by the term $m_r/t$, where the number of nodes at time
$t$ is proportional to $t$. Each of these ends is rewired preferentially.

Hence, the number of new edges which appear in the network up to time $t$
is exactly the same as in the DM model (7). Therefore

$$\int_0^t k(u,t)\,du = 2mt + ct^2. \tag{12}$$

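Of course, it is easy to get (12) formally, by integrating both sides of
(11),

$$\int_0^t ds\,\frac{\partial k(s,t)}{\partial t} = (m + 2ct + m_r) - \frac{m_r}{t}\int_0^t ds = m + 2ct, \tag{13}$$

and with the help of the expression

$$\frac{\partial}{\partial t}\int_0^t k(s,t)\,ds = k(t,t) + \int_0^t ds\,\frac{\partial k(s,t)}{\partial t} = m + m + 2ct, \tag{14}$$

one identifies (12). Substituting (12) into (11), the equation becomes

$$\frac{\partial k(s,t)}{\partial t} = (m + 2ct + m_r)\,\frac{k(s,t)}{2mt + ct^2} - \frac{m_r}{t}. \tag{15}$$

Because $m_r$ is a constant, $m_r/t \to 0$ for $t \to \infty$. Using this,
(15) is simplified and solved analytically:

$$k(s,t) \propto \left(\frac{t}{s}\right)^{\frac{m + m_r}{2m}}\left(\frac{2m + ct}{2m + cs}\right)^{\frac{3m - m_r}{2m}}. \tag{16}$$

The solution of (11) is similar to that of (9), but with different $\beta$
exponents:

- if $s \ll t$, $\beta = \frac{m + m_r}{2m}$ and due to (5)
$\gamma = 2 + \frac{m - m_r}{m + m_r}$,

- but if $s \sim t$, $\beta = \frac{m + m_r}{2m} + \frac{3m - m_r}{2m} = 2$
and due to (5) $\gamma = 1.5$.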
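The two regimes can be checked numerically from the closed-form solution.
The sketch below assumes the form obtained by integrating (15) without the
$m_r/t$ term, i.e. exponents $(m + m_r)/(2m)$ and $(3m - m_r)/(2m)$ on the
two factors; the values of `c` and `t` are chosen freely for illustration,
and the function names are hypothetical.

```python
import math

def log_degree(s, t, m, c, m_r):
    """ln k(s,t), up to an additive constant, from the large-t solution."""
    a = (m + m_r) / (2 * m)
    b = (3 * m - m_r) / (2 * m)
    return a * math.log(t / s) + b * math.log((2 * m + c * t) / (2 * m + c * s))

def local_beta(s, t, m, c, m_r, eps=1e-4):
    """Numerical estimate of beta = -d ln k / d ln s at birth time s."""
    hi = log_degree(s * (1 + eps), t, m, c, m_r)
    lo = log_degree(s * (1 - eps), t, m, c, m_r)
    return -(hi - lo) / (math.log(1 + eps) - math.log(1 - eps))

def gamma_from_beta(beta):
    """Scaling relation beta * (gamma - 1) = 1."""
    return 1 + 1 / beta

# illustrative parameters: m and m_r as in the text's estimate, c and t assumed
m, c, m_r, t = 10, 0.01, 7.7, 1e8
gamma_old = gamma_from_beta(local_beta(1.0, t, m, c, m_r))        # old, well connected nodes
gamma_young = gamma_from_beta(local_beta(0.5 * t, t, m, c, m_r))  # young, weakly connected nodes
```

With these parameters the old nodes give an exponent close to the measured
2.13 and the young nodes an exponent close to 1.5, in line with the two
regimes above.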

It is clear that in my model the scaling exponent $\gamma$ in the region of
large $k$ is lower than the value $\gamma_{BA} = 3$, but keeps the value 1.5
in the region of small $k$. This is exactly what was measured by Cancho and
Sole and by us [1, 5] (Fig. 1). The model (11) seems to fit the data better
than the former DM model [3].

Let us speculate a little. Our measurement [5] (Fig. 1) shows that
$\gamma = 2.13$ for large $k$. Let us suppose that a newborn word has about
ten connections, $m \approx 10$. In that case the number of rewired edge
ends is $m_r \approx 7.7$.
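The estimate can be reproduced by inverting the large-$k$ exponent of the
model. The sketch assumes that this exponent follows from the scaling
relation (5) with $\beta = (m + m_r)/(2m)$, which is equivalent to
$\gamma = 2 + (m - m_r)/(m + m_r)$.

```latex
\begin{align*}
\gamma - 1 &= \frac{2m}{m + m_r}
  \quad\Longrightarrow\quad
  m_r = \frac{2m}{\gamma - 1} - m, \\
m_r &= \frac{2 \cdot 10}{2.13 - 1} - 10
     = \frac{20}{1.13} - 10 \approx 17.70 - 10 = 7.70 .
\end{align*}
```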
5 Conclusion



In conclusion, I present a model of a growing network which includes
several local events, such as preferential link addition and preferential
link rewiring. The model describes the measured word web data correctly,
both qualitatively and quantitatively. My model is inspired by the DM model
of a growing network [5]. Additional local processes of edge rewiring cause
the scaling exponent $\gamma$ of the distribution function (1) to be lower
than the exponent of the fully analytically solvable and well known BA
model [10] (2). These local events are:

a) random node exclusion, and 

b) preferential rewiring of one link end of the chosen node. 

In our word web these processes mean that a certain word loses one of its
meanings or contexts, and another one is used in a different context. For
example, the word "notebook" used to denote an exercise book for children.
Now it is used more in the context of computers and informatics. Another
example: a "computer" in the fifties was a big device. To tell anybody to
put the computer on the table was nonsense. Now it is perfectly OK.

A more detailed analysis of our data indicates that the scaling exponent
$\gamma$ might be slightly lower than 1.5 for small $k$. This is also
supported by our analysis of other texts [15]. I therefore suppose that
there are other processes, such as node aging [13] or random edge rewiring
[11], that might play some role. Investigating their relevance is a task
for the future.